Siamese Autoencoders for Speech Style Extraction and Switching Applied to Voice Identification and Conversion

نویسندگان

  • Seyed Hamidreza Mohammadi
  • Alexander Kain
چکیده

We propose an architecture called siamese autoencoders for extracting and switching pre-determined styles of speech signals while retaining the content. We apply this architecture to a voice conversion task in which we define the content to be the linguistic message and the style to be the speaker’s voice. We assume two or more data streams with the same content but unique styles. The architecture is composed of two or more separate but shared-weight autoencoders that are joined by loss functions at the hidden layers. A hidden vector is composed of style and content sub-vectors and the loss functions constrain the encodings to decompose style and content. We can select an intended target speaker either by supplying the associated style vector, or by extracting a new style vector from a new utterance, using a proposed style extraction algorithm. We focus on in-training speakers but perform some initial experiments for out-of-training speakers as well. We propose and study several types of loss functions. The experiment results show that the proposed many-to-many model is able to convert voices successfully; however, its performance does not surpass that of the state-of-the-art one-to-one model’s.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article aims to examine methods of optimizing GMM-based voice conversion systems performance in which GMM method is introduced as the basic method for improvement of voice conversion systems performance. In the current methods, due to using a single conversion function to convert all speech units and subsequent spectral smoothing arising from statistical averaging, we will observe quality ...

متن کامل

طراحی یک روش آموزش ناموازی جدید برای تبدیل گفتار با عملکردی بهتر از آموزش موازی

Introduction: The art of voice mimicking by computers, has with the computer have been one of the most challenging topics of speech processing in recent years. The system of voice conversion has two sides. In one side, the speaker is the source that his or her voice has been changed for mimicking the target speaker’s voice (which is on the other side). Two methods of p...

متن کامل

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...

متن کامل

بررسی برخی ویژگی های آکوستیک گفتار نوزاد مدار در مادران فارسی زبان

Introduction: When adults talk to another person, linguistic characteristics of the listener will also be considered. A clear example of speech changes depending on the listener is maternal or infant directed speech. Infant directed speech is more slowly with longer sentences and pauses at the end of the utterance. Undoubtedly the most distinctive feature of this style of speech is acoustic c...

متن کامل

On Using Backpropagation for Speech Texture Generation and Voice Conversion

Inspired by recent work on neural network image generation which rely on backpropagation towards the network inputs, we present a proof-of-concept system for speech texture synthesis and voice conversion based on two mechanisms: approximate inversion of the representation learned by a speech recognition neural network, and on matching statistics of neuron activations between different source an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017